A Data-Driven Dependency Parser for Bulgarian

نویسندگان

  • Svetoslav Marinov
  • Joakim Nivre
چکیده

One of the main motivations for building treebanks is that they facilitate the development of syntactic parsers, by providing realistic data for evaluation as well as inductive learning. In this paper we present what we believe to be the first robust data-driven parser for Bulgarian, trained and evaluated on data from BulTreeBank (Simov et al., 2002). The parser uses dependency-based representations and employs a deterministic algorithm to construct dependency structures in a single pass over the input string, guided by a memory-based classifier at each nondeterministic choice point, as described in Nivre et al. (2004). Since the original BulTreeBank annotation is based on HPSG, it has been necessary to extract dependency structures from the original annotation both for training and evaluating the parser. The paper is structured as follows. Section 2 introduces the MaltParser system used in the experiments to induce parsers from treebank data. Section 3 presents BulTreeBank, and section 4 shows how the BulTreeBank annotation can be converted into a dependency structure annotation of the kind required by MaltParser. Section 5 describes the experimental conditions, and section 6 discusses the results of the experiments. Section 7 contains our conclusions.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Towards Minimal Recursion Semantics over Bulgarian Dependency Parsing

The paper discusses the transferring rules of the output from a dependency parser for Bulgarian into RMRS analyses. This task is required by the machine translation compatibility between Bulgarian and English resources. Since the Bulgarian HPSG grammar is still being developed, a repairing mechanism has been envisaged by parsing the Bulgarian data with the Malt Dependency Parser, and then retri...

متن کامل

Feature Engineering in Persian Dependency Parser

Dependency parser is one of the most important fundamental tools in the natural language processing, which extracts structure of sentences and determines the relations between words based on the dependency grammar. The dependency parser is proper for free order languages, such as Persian. In this paper, data-driven dependency parser has been developed with the help of phrase-structure parser fo...

متن کامل

MaltParser: A Data-Driven Parser-Generator for Dependency Parsing

We introduce MaltParser, a data-driven parser generator for dependency parsing. Given a treebank in dependency format, MaltParser can be used to induce a parser for the language of the treebank. MaltParser supports several parsing algorithms and learning algorithms, and allows user-defined feature models, consisting of arbitrary combinations of lexical features, part-of-speech features and depe...

متن کامل

Improving data-driven dependency parsing using large-scale LFG grammars

This paper presents experiments which combine a grammar-driven and a datadriven parser. We show how the conversion of LFG output to dependency representation allows for a technique of parser stacking, whereby the output of the grammar-driven parser supplies features for a data-driven dependency parser. We evaluate on English and German and show significant improvements stemming from the propose...

متن کامل

Partial Parsing from Bitext Projections

Recent work has shown how a parallel corpus can be leveraged to build syntactic parser for a target language by projecting automatic source parse onto the target sentence using word alignments. The projected target dependency parses are not always fully connected to be useful for training traditional dependency parsers. In this paper, we present a greedy non-directional parsing algorithm which ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005